A Supernodal Approach to Incomplete LU Factorization with Partial Pivoting

نویسندگان
چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Supernodal Approach to Sparse Partial Pivoting

We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. We introduce the notion of unsymmetric supernodes to perform most of the numerical computation in dense matrix kernels. We introduce unsymmetric supernode-panel updates and two-dimensional data partitioning to better exploit the memory hierarchy. ...

متن کامل

Parallel Symbolic Factorization for Sparse LU Factorization with Static Pivoting

In this paper we consider a direct method to solve a sparse unsymmetric system of linear equations Ax = b, which is the Gaussian elimination. This elimination consists in explicitly factoring the matrix A into the product of L and U , where L is a unit lower triangular matrix, and U is an upper triangular matrix, followed by solving LUx = b one factor at a time. One of the main characteristics ...

متن کامل

A Comparison of D and D Data Mapping for Sparse LU Factorization with Partial Pivoting

This paper presents a comparative study of two data mapping schemes for parallel sparse LU factorization with partial pivoting on distributed memory machines Our previous work has developed an approach that incorporates static symbolic factoriza tion nonsymmetric L U supernode partitioning and graph scheduling for this problem with D column block mapping The D mapping is commonly considered mor...

متن کامل

LU Factorization with Partial Pivoting for a Multi-CPU, Multi-GPU Shared Memory System

LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance LINPACK benchmark. This article presents an implementation of the algorithm for a hybrid, shared memory, system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.

متن کامل

A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization with Partial Pivoting

We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques target the scenario where two thread teams are created/activated during the factorization, with each team in charge of performing an independent task/branch of execution. The first ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: ACM Transactions on Mathematical Software

سال: 2011

ISSN: 0098-3500,1557-7295

DOI: 10.1145/1916461.1916467